CORE score Flash News List | Blockchain.News

List of Flash News about CORE score

Time Details
2026-01-07 23:01
Karpathy Reveals nanochat Scaling-Law Breakthrough: Compute-Optimal LLMs on 8x H100 for about $100, CORE-Score Benchmarks vs GPT-2/3

According to @karpathy, nanochat’s first public miniseries (v1) demonstrates compute-optimal LLM training across model sizes at fixed FLOPs, with an end-to-end pipeline and reproducible scripts (source: @karpathy on X, Jan 7, 2026; nanochat GitHub discussion #420). He reports that nanochat reproduces Chinchilla-like scaling: the exponents on parameters and data are equal, both near 0.5, and the compute-optimal ratio is a single compute-independent constant of about 8 tokens per parameter, versus the 20 reported for Chinchilla (source: @karpathy on X, Jan 7, 2026; Hoffmann et al. 2022, Chinchilla).

The sweep from d10 to d20 achieves non-intersecting training curves at batch sizes around 2^19 (about 0.5M) on one 8x H100 node without gradient accumulation (source: @karpathy on X, Jan 7, 2026). He relates nanochat to GPT-2 and to estimated GPT-3 results using the CORE score, which gives an objective cross-series comparison (source: @karpathy on X, Jan 7, 2026; DCLM paper, CORE score).

The total experiment cost is about $100 for roughly 4 hours on 8x H100, with all tuning and code pushed to master for reproduction via scaling_laws.sh and miniseries.sh (source: @karpathy on X, Jan 7, 2026; nanochat GitHub discussion #420). Spread over 8 GPUs x 4 hours = 32 GPU-hours, this implies roughly $3.1 per H100 GPU-hour for the described run, offering a live reference for pricing compute in AI workloads (source: calculation based on @karpathy on X, Jan 7, 2026).

For crypto markets, these cost and scaling benchmarks are relevant to workload pricing and benchmarking on decentralized GPU networks that price or facilitate GPU time, such as Render Network (RNDR) and Akash Network (AKT) (source: Render Network documentation; Akash Network documentation).
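The two pieces of arithmetic in this item can be checked with a short Python sketch. The GPU-hour price uses only the figures quoted above ($100, about 4 hours, 8 GPUs). The compute-optimal split additionally assumes the common approximation C ≈ 6 * N * D for training FLOPs, which is not stated in the post; the function names are hypothetical and do not come from the nanochat repository.

```python
import math
from typing import Tuple


def gpu_hour_price(total_cost_usd: float, wall_hours: float, num_gpus: int) -> float:
    """Implied price per GPU-hour for a run billed as one lump sum."""
    return total_cost_usd / (wall_hours * num_gpus)


def compute_optimal_split(flops: float, tokens_per_param: float) -> Tuple[float, float]:
    """Split a FLOPs budget C into parameters N and training tokens D.

    Assumption (not from the post): C ~= 6 * N * D.
    With a fixed ratio D = r * N this gives N = sqrt(C / (6 * r)),
    so both N and D grow as C**0.5, matching exponents near 0.5.
    """
    n_params = math.sqrt(flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Figures quoted in the item: about $100, about 4 hours, 8x H100.
    print(f"implied $/GPU-hour: {gpu_hour_price(100.0, 4.0, 8):.3f}")

    # Illustrative budget only; 1e18 FLOPs is not a number from the post.
    for ratio in (8.0, 20.0):
        n, d = compute_optimal_split(1e18, ratio)
        print(f"ratio {ratio:>4}: N ~ {n:.3e} params, D ~ {d:.3e} tokens")
```

Under these assumptions, a ratio of 8 rather than 20 tokens per parameter puts the same budget into a model about sqrt(20/8) ≈ 1.58 times larger, trained on correspondingly fewer tokens.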
